Development of Russian lexical databases, corpora and supporting tools for speech products

Author

  • Serge A. Yablonsky
Abstract

Russian language resources are currently fragmented and disorganized. It is therefore important to develop the basic resources for Russian as a single package that can be used to build speech products. The paper presents the design of Russian lexical databases, corpora and supporting tools (a system for constructing and maintaining lexical databases, a transcription system, a morphological analyzer and a normalizer) developed for wide use in speech engineering.

Related resources

Meaning as use: exploitation of aligned corpora for the contrastive study of lexical semantics

The paper discusses the use of corpora for experimental studies in contrastive lexical semantics, in particular, for comparing how a state of affairs is expressed in different languages and by different translators. Three topics are addressed: (1) a lexicographic database, which is aimed at storing and maintaining contrastive descriptions of a class of lexical items in several languages; (2) an...

Integration of Russian Language Resources

In this paper we describe the creation of large-scale linguistic resources for the Russian language. An Internet/intranet system architecture was developed to make a large volume of Russian lexical information, corpora (texts) and a knowledge base (Russian WordNet) available to the system at development and/or run time. There are four linguistic counterparts, corresponding to the major categori...

Lexically Restricted Utterances in Russian, German, and English Child-Directed Speech

This study investigates the child-directed speech (CDS) of four Russian-, six German-, and six English-speaking mothers to their 2-year-old children. Typologically, Russian has considerably less restricted word order than either German or English, with German showing more word-order variants than English. This could lead to the prediction that the lexical restrictiveness previously found in the i...

Noospheric Psychological-Educational Paradigm as a Methodological Basis for Teaching Russian-Language Business Communication to Foreign Students

In the context of the polyparadigmatic system of higher education, the noospheric psychological-pedagogical paradigm is considered, and on its basis a linguodidactic model is developed for the formation of professional-communicative competence (PCC) in Russian-language business communication among foreign students. The research focuses on the basic principles of the noospheric paradigm, which procl...

Making Full Use of Chinese Speech Corpora

It is well understood that speech databases play a very important role in speech recognition. Speech recognition researchers dream of creating more useful databases with less effort. To achieve this goal, a database must first be well designed, and tools and additional information should be provided so that the database can be fully used. This paper will illustrate th...

Publication date: 2001